
Collect and deliver data with Vector by Datadog and CockroachDB

Last edited on November 7, 2022


    In my technical journey, I struggled to find a simple tool that gave me the flexibility to collect and deliver data to other monitoring tools. Until, one day, I found it.

    In 2007, I was entrusted to design a script that would mitigate monetary losses for the company, despite having no experience designing monitoring and alerting tools at the time. For days I struggled to build the logic with Bash scripts and CSV files for what would become a paging alert system for Solaris (which executed critical C-based code for our customers).

    The task of collecting keywords from log data and fetching metrics such as CPU, disk size, and the number of running processes was quite overwhelming! At that point there were also not many robust tool options on the market, and the few that existed were unattainable due to many factors, including company policies, pricing, multi-alert tool requirements, and complexity.

    DevOps, SREs, and DBAs know this pain.

    The miracle tool

    Fast forward a few years, to my days at Datadog as a Solutions Engineer, where I came across a recently acquired, versatile tool called Vector that some Datadog customers had started to adopt. The tool seemed to have all the characteristics of the miracle tool: open source, flexible, configurable, unintrusive, and able to deliver data just about anywhere.

    During conversations with former colleagues and customers, they stressed the need for a flexible data collector that could deliver filtered, sanitized, and custom data sets to multiple monitoring tools. Now, a few months later at Cockroach Labs, I have learned that Datadog can be natively integrated to monitor self-hosted and dedicated distributed database clusters.

    Why choose Vector?

    So, why Vector by Datadog and not the Datadog/Splunk/Dynatrace/New Relic/Elastic/other agent? One caveat with running multiple agent collectors is that each agent is designed to take full control of a log file, so a second agent ends up fighting for resources on the same file when the requirement is to send the log to multiple providers (e.g., Splunk, Datadog, and Elasticsearch).

    Vector had matured at Datadog, and I saw a good fit between Vector and CockroachDB for a use case that many users have asked for in the past: delivering data from self-managed systems to multiple endpoints while keeping control over the data collected.

    Vector uses “Sources” to collect data, “Transforms” to cleanse/filter/customize data, and “Sinks” to deliver data.
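
    To make those three concepts concrete, here's a tiny self-contained pipeline (purely illustrative, not part of the CockroachDB config below): a demo_logs source generates sample syslog events, a remap transform tags each one, and a console sink prints the results.

    # Generate sample syslog events (Vector's built-in demo source)
    [sources.demo]
    type = "demo_logs"
    format = "syslog"

    # Tag every event before delivery
    [transforms.tag_events]
    type = "remap"
    inputs = ["demo"]
    source = '.team = "db-ops"'

    # Print the transformed events to stdout as JSON
    [sinks.out]
    type = "console"
    inputs = ["tag_events"]
    encoding.codec = "json"

    Running vector --config against this file prints tagged JSON events, a quick way to watch data move through all three stages before wiring up real data.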


    The steps to achieve a quick installation are not difficult if we're willing to invest a bit of time in the custom solution. The TOML example provided is configured per host, but it can also be used in Kubernetes deployments.

    Get started with Vector and CockroachDB

    Step 1: Install Vector

    curl -1sLf 'https://repositories.timber.io/public/vector/cfg/setup/bash.deb.sh' | sudo -E bash
    sudo apt-get install -y vector
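
    With the package installed, a quick sanity check confirms the binary is available:

    vector --version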

    Step 2: Configuration

    Configure the Vector TOML/YAML/JSON file with Sources, Transforms, and Sinks; see the final TOML config file sample in my GitHub repo.

    Step 3: Collect logs

    Collect CockroachDB logs with sources (select the logs you need to collect)

    [sources.crdb_logs]
    type = "file"
    ignore_older_secs = 600
    include = [
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-health.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-security.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-audit.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-auth.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-exec.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-slow.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-sql-schema.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-pebble.log",
      "/mnt/disks/crdb-disk/cockroach-data/logs/cockroach-telemetry.log"
    ]

    Step 4: Collect data

    Collect CockroachDB data from the HTTP endpoint with the Prometheus scrape source

    [sources.crdb_ui_console]
    type = "prometheus_scrape"
    endpoints = ["http://localhost:8080/_status/vars"]
    scrape_interval_secs = 15
    instance_tag = "instance"
    endpoint_tag = "endpoint"
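
    Before wiring up the scrape, it's worth confirming the node is actually serving Prometheus-format metrics on that endpoint (this assumes the default DB Console port 8080 used above):

    curl -s http://localhost:8080/_status/vars | head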

    Step 5: Collect server metrics

    Collect server metrics (CPU, memory, processes, and more) with sources
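
    This step's config isn't shown above, so here is a minimal sketch using Vector's host_metrics source. I've named it crdb_node_metrics to match the input the Splunk and Datadog sinks in Step 7 expect; the collector list is an assumption to tune for your hosts.

    # Sketch: host-level metrics source; the name matches the
    # "crdb_node_metrics" input referenced by the sinks in Step 7,
    # and the collector list is an assumption to adjust as needed.
    [sources.crdb_node_metrics]
    type = "host_metrics"
    collectors = ["cpu", "memory", "disk", "load", "network"]
    scrape_interval_secs = 15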

    Step 6: Customize metrics

    Customize CockroachDB metrics with Transforms (e.g., converting CRDB metrics to logs for Splunk)

    [transforms.crdb_metrics_to_logs_splunk]
    type = "metric_to_log"
    inputs = ["crdb_ui_console"]
    host_tag = "hostname"
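
    Transforms can also cleanse and filter before delivery. As an illustrative sketch (not part of the original config), a filter transform could forward only the metrics you care about; the sql_ prefix tested here is just an example:

    # Hypothetical filter: forward only SQL-related CockroachDB metrics
    [transforms.crdb_sql_metrics_only]
    type = "filter"
    inputs = ["crdb_ui_console"]
    condition = 'starts_with(string!(.name), "sql_")'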

    Step 7: Endpoint alerting tool

    Determine the delivery endpoint, i.e. the alerting tool or platform (many providers offer free trials of their platforms), and configure it with sinks (e.g., sending data to Splunk and Datadog):

    ## Splunk
    # Ship the transformed metrics-as-logs to the Splunk HEC endpoint
    [sinks.logs_to_splunk_hec]
    type = "splunk_hec_logs"
    inputs = ["crdb_metrics_to_logs_splunk"]
    endpoint = "https://192.168.86.30:8088"
    host_key = "hostname"
    indexed_fields = ["field1"]
    compression = "gzip"
    default_token = "${SPLUNK_HEC_TOKEN}"
    tls.verify_certificate = false

    [sinks.logs_to_splunk_hec.encoding]
    codec = "json"

    # Ship host metrics to Splunk as native metrics
    [sinks.metrics_to_splunk_hec]
    type = "splunk_hec_metrics"
    inputs = ["crdb_node_metrics"]
    endpoint = "https://192.168.86.30:8088"
    host_key = "hostname"
    index = "{{ host }}"
    source = "{{ file }}"
    compression = "gzip"
    default_token = "${SPLUNK_HEC_TOKEN}"
    sourcetype = "{{ sourcetype }}"
    tls.verify_certificate = false

    ## Datadog
    # Send both scraped CockroachDB metrics and host metrics to Datadog
    [sinks.crdb_dd_metrics]
    type = "datadog_metrics"
    inputs = ["crdb_ui_console", "crdb_node_metrics"]
    default_api_key = "${DATADOG_API_KEY}"

    # Send the raw CockroachDB log files to Datadog Logs
    [sinks.crdb_dd_logs]
    type = "datadog_logs"
    inputs = ["crdb_logs"]
    default_api_key = "${DATADOG_API_KEY}"
    region = "us"
    compression = "gzip"
    site = "datadoghq.com"
    tags = ["source:vector", "env:dev", "collector:live process"]

    Step 8: Test

    Test the Vector configuration before using systemctl

    sudo vector --config vector.toml
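
    If you only want to check the file without starting the pipeline, recent Vector releases also include a validate subcommand:

    vector validate vector.toml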

    Step 9: Deployment

    Move the vector.toml file to the Vector config directory and deploy Vector via systemctl

    sudo mv vector.toml /etc/vector/vector.toml
    sudo systemctl start vector
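
    To keep Vector running across reboots and confirm it started cleanly, the usual systemd follow-ups apply:

    sudo systemctl enable vector
    sudo systemctl status vector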

    While there are paid and free tools on the market that offer many capabilities, Vector is community maintained and offers the flexibility to capture custom data and logs. The downside is the learning curve, and the trial and error of converting data that arrives as logs into metrics. The upside is Vector's flexibility to deliver parsed data to different provider endpoints.

    So what do we gain by using Vector in our self-hosted CockroachDB clusters? We gain the flexibility to send cleansed metrics to more than a single provider without being tied to any specific one. The all-or-nothing approach to collecting logs and metrics does not apply to Vector. Long gone are the days when we were limited in how we could collect metrics and alert on anomalies in our systems.

    So what can Vector by Datadog do for my CockroachDB Self-hosted cluster? It lets me deliver monitoring data to multiple teams without overburdening my infrastructure with agents fighting over the same resources.
